
fixed and enabled performance-faster-string-find clang-tidy warning#4769

Merged
danmar merged 1 commit into cppcheck-opensource:main from firewave:tidy-find
Feb 8, 2023

Conversation

@firewave
Collaborator

@firewave firewave commented Feb 4, 2023

Another long overdue try to get this applied.

Here's some actual benchmarks this time:
https://quick-bench.com/q/K1KmsDawQ4PpwwAMTqHdprlZeUU
https://quick-bench.com/q/zEJnrbF5gPaa1o7d32oQzu8UeU8

@danmar
Collaborator

danmar commented Feb 7, 2023

Intuitively it should be faster. Do you have some measurements?

I don't think such micro optimisations will make any big difference. I think the focus should be on rewriting algorithms to make Cppcheck faster.

@firewave
Collaborator Author

firewave commented Feb 7, 2023

I attached links to benchmarks. There's no way to make the algorithm faster; it's a suboptimal implementation in the standard library.

In the matchcompiler we optimize s == "1" to s.size() == 1U && s[0] == '1'. So we jump through bigger hoops in other code to get slight performance improvements. This is quite similar and in this case the code stays virtually identical.

@danmar
Collaborator

danmar commented Feb 8, 2023

This is quite similar and in this case the code stays virtually identical.

We get more and more micro optimisations... it's not good and it gets worse and worse.

I don't want to unroll all the micro optimisations we made at this point. But if we can stop making more it would make me feel better.

I am not strongly against this particular PR; we can reluctantly merge it now. I don't feel there is any payoff, though. I wonder how I can stop the focus on micro optimisations. I would assume that we can rewrite parser code and get significant speedups.

@firewave
Collaborator Author

firewave commented Feb 8, 2023

Well, the speed-up is beyond a micro-optimization if it delivers a 1.8x improvement. The linked benchmark is measuring actual time and not instruction count, which is what we actually should be profiling for. This is just an easy "win" compared with the other stuff I did. And we have tooling to support us with it.

Just "rewriting" parsers and other things will lead to actual micro-optimizations like hot and cold attributes, refactoring to avoid cache misses and other things which make the code harder to read. Those might also vary with the compiler you are using. But all those speed-ups would not help much as there's still the ValueFlow...

I am already trying to find ways to get rid of the matchcompiler and have the compilers behave better with common patterns.

@danmar
Collaborator

danmar commented Feb 8, 2023

Well, the speed-up is beyond a micro-optimization if it delivers a 1.8x improvement.

Ok, thanks, I did not see that benchmark. However, I assume the difference in Cppcheck analysis time will be very small.

Maybe micro optimisations are more OK in the matchcompiler output, if there is something we can safely do. Whatever it means for readability does not matter there.

The linked benchmark is measuring actual time and not instruction count, which is what we actually should be profiling for.

I agree

Just "rewriting" parsers and other things will lead to actual micro-optimizations

In my opinion you are not talking about "rewriting" then. I am not talking about rearranging code to avoid cache misses.

But all those speed-ups would not help much as there's still the ValueFlow...

I agree. We need to be much more careful in the ValueFlow.

@danmar danmar merged commit 8ef14da into cppcheck-opensource:main on Feb 8, 2023
@firewave firewave deleted the tidy-find branch on February 8, 2023 at 20:08
@firewave
Collaborator Author

firewave commented Feb 8, 2023

The linked benchmark is measuring actual time and not instruction count, which is what we actually should be profiling for.

I agree

I am learning and improving my approaches as well as refining my Cppcheck and LLVM tickets.

Just "rewriting" parsers and other things will lead to actual micro-optimizations

In my opinion you are not talking about "rewriting" then. I am not talking about rearranging code to avoid cache misses.

I took a short look at cache misses and we have buckets of them. They primarily seem to be related to lambda usage, but I couldn't really figure out where exactly; I am just not familiar with cache profiling. But it might also be a shortcoming because we cannot move variables into lambdas yet.

Maybe micro optimisations are more OK in the matchcompiler output. If there is something we can safely do. Whatever it means to readability does not matter then.

I would really love it if the compiler would just generate optimal code for std::string usage and we could drop (most of) that. It's not like that is some kind of corner case...

@firewave
Collaborator Author

firewave commented Feb 8, 2023

I will still do some before-and-after profiling of these changes.

@firewave
Collaborator Author

Good news: perf appears to be working in WSL2 now, so it should be possible to do actual profiling based on cycles instead of instructions. That value is obviously not as stable as the instruction count, but it also reflects the actual time spent.


2 participants